-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change rangelock handling in FreeBSD's zfs_getpages() #16643
base: master
Are you sure you want to change the base?
Conversation
mappedread_sf() may allocate pages; if it fails to populate a page can't free it, it needs to ensure that it's placed into a page queue, otherwise it can't be reclaimed until the vnode is destroyed. I think this is quite unlikely to happen in practice, it was noticed by code inspection. Signed-off-by: Mark Johnston <markj@FreeBSD.org>
As a deadlock avoidance measure, zfs_getpages() would only try to acquire a rangelock, falling back to a single-page read if this was not possible. However, this is incompatible with direct I/O. Instead, release the busy lock before trying to acquire the rangelock in blocking mode. This means that it's possible for the page to be replaced, so we have to re-lookup. Signed-off-by: Mark Johnston <markj@FreeBSD.org>
vm_page_xunbusy(ma[0]); | ||
|
||
lr = zfs_rangelock_enter(&zp->z_rangelock, | ||
rounddown(start, blksz), len, RL_READER); | ||
|
||
zfs_vmobject_wlock(object); | ||
ma[0] = vm_page_grab(object, OFF_TO_IDX(start), | ||
VM_ALLOC_NORMAL | VM_ALLOC_WAITOK | VM_ALLOC_ZERO); | ||
zfs_vmobject_wunlock(object); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry for my possible ignorance in VM, but is count
expected to be 1 here, or why else do we re-grab only ma[0]
page here, not the rest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, we should handle the full array. In practice we always have count == 1
here because ZFS does not implement VOP_BMAP.
I'm not familiar with the FreeBSD VM layer but it looks like the From the "vm2: serial console" log, https://github.com/openzfs/zfs/actions/runs/11315921698/job/31476777580?pr=16643
|
Thanks, I think I see what's going on there. I didn't see that panic when I ran the ZTS locally, but if I understand the problem correctly, it's due to a race. |
Unconditionally hold the rangelock in zfs_getpages().
Motivation and Context
This is reportedly required for direct I/O support on FreeBSD.
Description
This change modifies zfs_getpages() to uncondtionally acquire the rangelock. To avoid a deadlock, we may need to drop a page busy lock and allocate a new page later on.
How Has This Been Tested?
Smoke testing on FreeBSD.
Types of changes
Checklist:
Signed-off-by
.